MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
Abstract
We introduce a large dataset of narrative texts and questions about these texts, intended to be used in a machine comprehension task that requires reasoning using commonsense knowledge. Our dataset complements similar datasets in that we focus on stories about everyday activities, such as going to the movies or working in the garden, and that the questions require commonsense knowledge, or more specifically, script knowledge, to be answered. We show that our mode of data collection via crowdsourcing results in a substantial amount of such inference questions. The dataset forms the basis of a shared task on commonsense and script knowledge organized at SemEval 2018 and provides challenging test cases for the broader natural language understanding community.
Similar resources
A Pilot Study of Biomedical Text Comprehension using an Attention-Based Deep Neural Reader: Design and Experimental Analysis
BACKGROUND: With the development of artificial intelligence (AI) technology centered on deep learning, computers have evolved to the point where they can read a given text and answer a question based on its context. This task is known as machine comprehension. Existing machine comprehension tasks mostly use datasets of general texts, such as news articles or elem...
CliCR: A Dataset of Clinical Case Reports for Machine Reading Comprehension
We present a new dataset for machine comprehension in the medical domain. Our dataset uses clinical case reports with around 100,000 gap-filling queries about these cases. We apply several baselines and state-of-the-art neural readers to the dataset, and observe a considerable gap in performance (20% F1) between the best human and machine readers. We analyze the skills required for successful a...
The Effectiveness of Shadow-Reading With and Without Written Script on Listening Comprehension of Iranian Intermediate EFL Students.
Listening comprehension is at the heart of language learning (Kurita, 2012). It is an important language skill to develop in terms of second language acquisition (SLA) (Dunkel, 1991; Rost, 2001; Vandergrift, 2007). In spite of its importance, L2 learners often regard listening as the most difficult language skill to learn. In this study, shadowing as an act or task in listening, in which the learner...
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
We present MCTest, a freely available set of stories and associated questions intended for research on the machine comprehension of text. Previous work on machine comprehension (e.g., semantic modeling) has made great strides, but primarily focuses either on limited-domain datasets, or on solving a more restricted goal (e.g., open-domain relation extraction). In contrast, MCTest requires machin...
Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension
We develop a technique for transfer learning in machine comprehension (MC) using a novel two-stage synthesis network (SynNet). Given a high-performing MC model in one domain, our technique aims to answer questions about documents in another domain, where we use no labeled data of question-answer pairs. Using the proposed SynNet with a pretrained model from the SQuAD dataset on the challenging N...
Publication date: 2018